Benchmarking attribute cardinality maps for database systems using the TPC-D specifications

Authors

  • B. John Oommen
  • Murali Thiyagarajah
Abstract

Benchmarking is an important phase in developing any new software technique because it helps to validate the underlying theory in the specific problem domain. However, benchmarking new software strategies is a very complex problem, because it is difficult (if not impossible) to test, validate and verify the results of the various schemes in completely different settings. This is even more true of database systems, because the outcome also depends on the types of queries presented to the databases used in the benchmarking experiments. Query optimization strategies in relational database systems rely on estimating query result sizes to minimize the response time for user queries. Among the many query result size estimation techniques, histogram-based techniques are by far the most commonly used in modern database systems. These techniques estimate query result sizes by approximating the underlying data distributions and are thus prone to estimation errors. In two recent works, we proposed (and thoroughly analyzed) two new histogram-like techniques, the rectangular and trapezoidal attribute cardinality maps (ACMs), which give much smaller estimation errors than the traditional equi-width and equi-depth histograms currently used by many commercial database systems. This paper reports how the Rectangular-ACM (R-ACM) and the Trapezoidal-ACM (T-ACM) can be benchmarked for query optimization. By conducting an extensive set of experiments using the acclaimed TPC-D benchmark queries and database, we demonstrate that these new ACM schemes are much more accurate than the traditional histograms for query result size estimation.
Apart from demonstrating the power of the ACMs, this paper also shows how TPC-D benchmarking can be achieved using a large synthetic database with many different patterns of synthetic queries that are representative of a real-world business environment.
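
To make the baseline concrete, the following is a minimal Python sketch of the two traditional histograms the abstract compares against, equi-width and equi-depth, together with a range-size estimator that assumes values are spread uniformly inside each bucket. It does not implement the R-ACM or T-ACM, whose construction is not described on this page. The function names, the bucket representation and the sample data are all assumptions made for illustration.

```python
def equi_width_histogram(values, num_buckets):
    """Split [min, max] into equally wide buckets; return (low, high, count) triples."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # Degenerate case: every value is identical, so one bucket suffices.
        return [(lo, hi, len(values))]
    width = (hi - lo) / num_buckets
    counts = [0] * num_buckets
    for v in values:
        # The maximum value would land at index num_buckets, so clamp it.
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, counts[i])
            for i in range(num_buckets)]


def equi_depth_histogram(values, num_buckets):
    """Split the sorted values into buckets holding (roughly) equal numbers of values."""
    ordered = sorted(values)
    n = len(ordered)
    buckets = []
    for i in range(num_buckets):
        start = i * n // num_buckets
        end = (i + 1) * n // num_buckets
        if end == start:
            continue  # more buckets than values
        # A bucket extends up to the first value of the next bucket.
        upper = ordered[end] if end < n else ordered[-1]
        buckets.append((ordered[start], upper, end - start))
    return buckets


def estimate_range_count(buckets, q_lo, q_hi):
    """Estimate how many values fall in [q_lo, q_hi], assuming uniformity per bucket."""
    total = 0.0
    for b_lo, b_hi, count in buckets:
        if b_hi == b_lo:
            # Zero-width bucket: all of its values equal b_lo.
            if q_lo <= b_lo <= q_hi:
                total += count
            continue
        overlap = min(q_hi, b_hi) - max(q_lo, b_lo)
        if overlap > 0:
            total += count * overlap / (b_hi - b_lo)
    return total


if __name__ == "__main__":
    # Hypothetical skewed attribute: two very frequent values plus a sparse tail.
    data = [1] * 50 + [2] * 30 + list(range(3, 23))
    true_count = sum(1 for v in data if 1 <= v <= 2)
    for name, build in (("equi-width", equi_width_histogram),
                        ("equi-depth", equi_depth_histogram)):
        estimate = estimate_range_count(build(data, 4), 1, 2)
        print(f"{name}: estimated {estimate:.1f}, true {true_count}")
```

On skewed data such as this, the uniformity assumption inside each bucket is what produces the estimation error; the abstract's claim is that the ACMs reduce this kind of error relative to both baselines.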


Similar resources

Data Generation for Application-Specific Benchmarking

The Transaction Processing Council (TPC) has played a pivotal role in the database industry’s growth over the last twenty-five years. However, its handful of domain-specific benchmarks are increasingly irrelevant to the multitude of data-centric applications, and its top-down process is slow. This mismatch calls for a paradigm shift to a bottom-up community effort to develop tools for applicatio...


Queue Weighting Load-Balancing Technique for Database Replication in Dynamic Content Web Sites

There is an ever-increasing need for database replication in dynamic web sites to improve availability. However, the main problem in replication is load balancing. This paper presents a new load-balancing technique to increase the performance of database replication in dynamic web sites, depending on the type and weight of the database server queue. We attempt to evaluate various load distribution policies,...


Benchmarking Hybrid OLTP&OLAP Database Systems

Recently, the case has been made for operational or real-time Business Intelligence (BI). As the traditional separation into OLTP database and OLAP data warehouse obviously incurs severe latency disadvantages for operational BI, hybrid OLTP&OLAP database systems are being developed. The advent of the first generation of such hybrid OLTP&OLAP database systems requires means to characterize their...


Comparative Performance Evaluation of E-commerce Technologies: A TPC-W-Based Benchmarking Tool

E-commerce systems are an important new application area in which maintaining good performance under scaling workloads is crucial to business success. The TPC-W benchmark is a benchmark designed to exercise a web server and associated transaction processing system in representative e-commerce scenarios. Whilst the benchmark specifies the architecture of the system, and the form of the interacti...


Benchmarking with TPC-H on Off-the-Shelf Hardware - An Experiments Report

Most medium-sized enterprises run their databases on inexpensive off-the-shelf hardware; still, answers to quite complex queries, like ad-hoc Decision Support System (DSS) ones, are required within a reasonable time window. Therefore, it becomes increasingly important that the chosen database system and its tuning be optimal for the specific database size and design. Such optimization could occ...



Journal title:
  • IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics : a publication of the IEEE Systems, Man, and Cybernetics Society

Volume 33, Issue 6

Pages: -

Publication date: 2003